Linear Contour Learning: A Method for Supervised Dimension Reduction
Authors
Abstract
We propose a novel approach to sufficient dimension reduction in regression, based on estimating contour directions of negligible variation for the response surface. These directions span the orthogonal complement of the minimal space relevant for the regression, and can be extracted according to a measure of the variation in the response, leading to General Contour Regression (GCR). In comparison to existing sufficient dimension reduction techniques, this contour-based methodology guarantees exhaustive estimation of the central space under ellipticity of the predictor distribution and very mild additional assumptions, while maintaining √n-consistency and computational ease. Moreover, it proves to be robust to departures from ellipticity. We also establish some useful population properties for GCR. Simulations to compare performance with that of standard techniques such as ordinary least squares, sliced inverse regression, principal Hessian directions, and sliced average variance estimation confirm the advantages anticipated by theoretical analyses. We also demonstrate the use of contour-based methods on a data set concerning grades of students from Massachusetts colleges.

Introduction and Background

… unsupervised approaches; here we consider dimension reduction for the regression of a continuous response Y on a vector of continuous predictors X = (X_1, …, X_p)^T ∈ R^p. Our approach is based on sufficient dimension reduction, a body of statistical theory and methods for reducing the dimension of X while preserving information on the regression; that is, on the conditional distribution of Y | X. A dimension reduction subspace (Cook, 1998) is defined as the column span of any p × d (d < p) matrix η such that …
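The contour idea can be illustrated with a minimal numerical sketch (an illustrative construction under simplifying assumptions, not the paper's exact GCR estimator): pairs of observations with nearly equal responses define empirical contour directions, and the central space is estimated from the eigenvectors of their standardized pairwise second-moment matrix with the smallest eigenvalues. The function name, the threshold `c`, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def simple_contour_regression(X, y, d=1, c=0.1):
    """Illustrative contour-based estimator (a sketch, not the paper's GCR).

    Pairs with |y_i - y_j| <= c move roughly along contours of the response
    surface, so their differences concentrate in the orthogonal complement of
    the central space.  After standardizing the predictors, the d eigenvectors
    of the pairwise second-moment matrix with the SMALLEST eigenvalues
    estimate the central space.
    """
    n, p = X.shape
    # Standardize predictors: Z = (X - mean) @ Sigma^{-1/2}
    Xc = X - X.mean(axis=0)
    Sigma = np.cov(Xc, rowvar=False)
    evals, evecs = np.linalg.eigh(Sigma)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ inv_sqrt
    # Accumulate outer products of Z_i - Z_j over near-contour pairs.
    K = np.zeros((p, p))
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            if abs(y[i] - y[j]) <= c:
                v = Z[i] - Z[j]
                K += np.outer(v, v)
                count += 1
    K /= max(count, 1)
    # Contour directions dominate K; central directions have small eigenvalues.
    w, V = np.linalg.eigh(K)            # eigenvalues in ascending order
    B = inv_sqrt @ V[:, :d]             # map back to the original X scale
    return B / np.linalg.norm(B, axis=0)
```

For a single-index model such as y = X_1 + noise, the recovered column aligns with the first coordinate axis, while the remaining axes (pure contour directions) are discarded.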
Similar papers
Gradient-based kernel dimension reduction for supervised learning
This paper proposes a novel kernel approach to linear dimension reduction for supervised learning. The purpose of the dimension reduction is to find directions in the input space to explain the output as effectively as possible. The proposed method uses an estimator for the gradient of regression function, based on the covariance operators on reproducing kernel Hilbert spaces. In comparison wit...
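The gradient-based idea described above can be sketched without the paper's RKHS machinery. Below, kernel ridge regression with an RBF kernel stands in (as an assumption) for the covariance-operator gradient estimator: we fit the regressor, evaluate the gradient of the fitted function at each sample, and take the top eigenvectors of the averaged gradient outer-product matrix as the estimated directions. All names and parameter values are illustrative.

```python
import numpy as np

def gradient_outer_product_directions(X, y, d=1, sigma=1.0, lam=1e-3):
    """Sketch of gradient-based linear dimension reduction (assumed variant).

    Fits f(x) = sum_j alpha_j * exp(-||x - x_j||^2 / (2 sigma^2)) by kernel
    ridge regression, computes grad f at each sample analytically, and returns
    the top-d eigenvectors of the averaged gradient outer product.
    """
    n, p = X.shape
    # RBF Gram matrix K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / (2 * sigma ** 2))
    alpha = np.linalg.solve(K + lam * n * np.eye(n), y)
    # grad f(x_i) = sum_j alpha_j K_ij (x_j - x_i) / sigma^2
    M = np.zeros((p, p))
    for i in range(n):
        g = (alpha * K[i]) @ (X - X[i]) / sigma ** 2
        M += np.outer(g, g)
    M /= n
    w, V = np.linalg.eigh(M)        # ascending eigenvalues
    return V[:, ::-1][:, :d]        # top-d directions
```

Because the gradient of the regression function lies in the effective subspace at every point, its averaged outer product is (at the population level) supported on that subspace, which is why the leading eigenvectors recover it.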
Constructing Interactive Visual Classification, Clustering and Dimension Reduction Models for n-D Data
The exploration of multidimensional datasets of all possible sizes and dimensions is a long-standing challenge in knowledge discovery, machine learning, and visualization. While multiple efficient visualization methods for n-D data analysis exist, the loss of information, occlusion, and clutter continue to be a challenge. This paper proposes and explores a new interactive method for visual disc...
A Supervised Probabilistic Principal Component Analysis Mixture Model in a Lossless Dimension Reduction Framework for Face Recognition
In this paper, we first propose the supervised version of the probabilistic principal component analysis mixture model. Then, we consider a predictive learning model with projection penalties, as an approach for dimensionality reduction without loss of information for face recognition. In the proposed method, first a local linear underlying manifold of data samples is obtained using the supervised...
Analysis of Correlation Based Dimension Reduction Methods
Dimension reduction is an important topic in data mining and machine learning. Especially dimension reduction combined with feature fusion is an effective preprocessing step when the data are described by multiple feature sets. Canonical Correlation Analysis (CCA) and Discriminative Canonical Correlation Analysis (DCCA) are feature fusion methods based on correlation. However, they are differen...
Semi-supervised learning with Gaussian fields
Gaussian fields (GF) have recently received considerable attention for dimension reduction and semi-supervised classification. This paper presents two contributions. First, we show how the GF framework can be used for regression tasks on high-dimensional data. We consider an active learning strategy based on entropy minimization and a maximum likelihood model selection method. Second, we show h...